ABSTRACT

Relying on one validation and verification (V&V) technique alone cannot detect all of the security problems of a software system. Each class of V&V
[...] of underlying vulnerabilities in software. The alerts produced by automated static analysis (ASA) tools and other static metrics have been
shown to be effective estimators of the actual reliability of a software system. Defect density and high-risk components can
be predicted using static analyzers early in the development phase. Our research hypothesis is that the actual number of security vulnerabilities
in a software system can be predicted based upon the number of security-related alerts reported by one or more static analyzers and by other
static metrics. We built and evaluated statistical prediction models that are used to predict the actual overall security of a system.
Predicting Attack-prone Components with Source Code Static Analyzers

Statement of Problem Studied

No single vulnerability detection technique can identify all vulnerabilities in a software
system. However, the vulnerabilities that a detection technique does identify may be
predictive of the residual vulnerabilities. We focus on creating and evaluating statistical models that
predict which components contain the highest-risk residual vulnerabilities.

The cost to find and fix faults grows with time in the software life cycle (SLC). A
challenge with our statistical models is to make the predictions available early in the SLC to
allow for cost-effective fortifications. Source code static analyzers (SCSA) are available
during the coding phase and are capable of detecting code-level vulnerabilities. We use the
code-level vulnerabilities identified by these tools to predict the presence of additional
coding vulnerabilities and vulnerabilities associated with the design and operation of the
software. The goal of this research is to reduce the number of vulnerabilities that escape into the field
by incorporating source code static analysis warnings into statistical models that predict
which components are most susceptible to attack.

The independent variable for our statistical model is the count of security-related
SCSA warnings. We also include the following metrics as independent variables in our
models to determine whether additional metrics are required to increase the accuracy of the model:
non-security SCSA warnings, code churn and size, the count of faults found manually during
development, and a measure of coupling between components. The dependent variable is
the count of vulnerabilities reported by testing and those found in the field.
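
To make the roles of these variables concrete, the following is a minimal sketch of one possible model, not the model used in the study. It assumes a logistic regression fitted by plain gradient descent, simplifies the dependent variable to a yes/no flag (whether any vulnerability was reported for a component), and uses only two of the listed metrics. The data values and the names COMPONENTS, fit_logistic, and predict are hypothetical.

```python
import math

# Hypothetical per-component data (illustrative only, not from the case studies):
# (security-related SCSA warnings, code churn in KLOC, 1 if a vulnerability was reported)
COMPONENTS = [
    (0, 0.2, 0), (1, 0.5, 0), (2, 0.3, 0), (3, 1.0, 1), (4, 1.3, 0),
    (5, 0.4, 0), (6, 1.5, 1), (8, 2.0, 1), (9, 1.2, 1), (12, 2.5, 1),
]


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, features):
    """Predicted probability that a component contains a vulnerability."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def fit_logistic(rows, labels, learning_rate=0.05, epochs=5000):
    """Fit intercept and coefficients by batch gradient descent on the log loss."""
    weights = [0.0] * (len(rows[0]) + 1)  # weights[0] is the intercept
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = predict(weights, features) - label
            gradient[0] += error
            for j, x in enumerate(features):
                gradient[j + 1] += error * x
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights


rows = [(warnings, churn) for warnings, churn, _ in COMPONENTS]
labels = [label for _, _, label in COMPONENTS]
weights = fit_logistic(rows, labels)

# Rank components from most to least likely to contain a vulnerability.
ranking = sorted(rows, key=lambda r: predict(weights, r), reverse=True)
for features in ranking:
    print(features, round(predict(weights, features), 3))
```

The ranking, rather than the raw probabilities, is what would guide where inspection and testing effort is spent first. A count-based model (for example, Poisson regression) would be closer to the stated dependent variable; the binary form is used here only to keep the sketch short.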


Summary of Most Important Results


We evaluated our model on three commercial telecommunications software systems.
Two case studies were performed at an anonymous vendor and the third case study was
performed at Cisco Systems. Each system represents a different technology and consists of over one
million source lines of C/C++ code. The results show positive and statistically significant
correlations between the metrics and vulnerability counts. Additionally, the predictive
models produce accurate probability rankings that indicate which components are most
susceptible to attack. The models are evaluated with receiver operating characteristic (ROC) curves;
in each case study, the area under the curve was over 92%. We also performed
five-fold cross-validation to further demonstrate statistical confidence in the models. Based
on these results we contribute the following theory:

Theory: Above a statistically determined threshold, SCSA vulnerability warnings are in
the same components as vulnerabilities that are likely to be exploited.

Components that contain security-related warnings identified by SCSA are also likely to
contain other exploitable vulnerabilities. Software engineers should systematically inspect
and test code for other vulnerabilities when a security-related warning is present. Fortifying
these vulnerabilities may help other techniques identify additional undetected
vulnerabilities.
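
The two evaluation techniques named in the results, area under the ROC curve and five-fold cross-validation, can be sketched as follows. This is an illustrative outline under stated assumptions, not the evaluation code from the case studies: the area is computed through its pairwise-comparison (Mann-Whitney) interpretation, folds are assigned round-robin rather than at random, and the names roc_auc and k_fold_indices, as well as the sample scores, are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (vulnerable, non-vulnerable) component pairs in which the vulnerable
    component receives the higher score; ties count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one component of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_items, k=5):
    """Yield (training indices, held-out indices) for k round-robin folds."""
    folds = [list(range(start, n_items, k)) for start in range(k)]
    for held_out in folds:
        held = set(held_out)
        training = [i for i in range(n_items) if i not in held]
        yield training, held_out


# Hypothetical scores (for example, predicted probabilities) and true labels.
scores = [0.05, 0.10, 0.20, 0.60, 0.35, 0.15, 0.80, 0.90, 0.70, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
print("AUC:", roc_auc(scores, labels))

for training, held_out in k_fold_indices(len(scores), k=5):
    # A model would be fitted on `training` and scored on `held_out` here.
    print("train on", training, "evaluate on", held_out)
```

In a full evaluation, the model would be refitted on each training split and the held-out predictions pooled or averaged before computing the area, so that no component is scored by a model that saw it during fitting.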